Abstract

Intuitively, similarity is a notion that measures an inexact match
between two entities of the same reference set. The notions of similarity and its close
relative dissimilarity are widely used in many fields of Artificial Intelligence. Yet they have 
many different and often partial definitions or properties, usually restricted to one field of 
application and thus incompatible with other uses. This paper contributes to the design 
and understanding of similarity and dissimilarity measures for Artificial Intelligence. A 
formal dual definition for each concept is proposed, joined with a set of fundamental 
properties. The behavior of the properties under several transformations is studied and 
revealed as an important matter to bear in mind. We also develop several practical 
examples that work out the proposed approach. 

1 Introduction 

From a psychological point of view, a human being uses the notions of similarity and dissim- 
ilarity for problem solving, inductive reasoning, element categorization, or simply to search 
for information partially matching specific criteria. The ability to assess similarities between a 
newly given pattern and already known patterns is a distinctive feature of human thinking. 

It is therefore not strange that similarity and its dual concept dissimilarity are a funda- 
mental part of many theories and applications in several fields, within or related to Artificial 
Intelligence, like Case Based Reasoning [1], Data Mining [2], Information Retrieval [3], Pattern
Matching [3] or Neural Networks, such as the Radial Basis Function network [5]. Many applications
are characterized by the use of metrics for measuring differences between objects. Metric dis- 
similarities have been deeply studied but they are tied to a particular transitivity expression 
based on the triangle inequality. Very often metric (distance) functions are used due to our 
natural understanding of Euclidean spaces. However, not all metrics are Euclidean and many 
interesting dissimilarities are non-metric. 

In a general sense, similarity and dissimilarity express a dual comparison between two
elements. We argue that every property of a similarity should have a correspondence with
one property of a dissimilarity and vice versa. This duality is commonly ignored, as are
some inconvenient properties (e.g. transitivity), and there are few general studies of how
transformations of a similarity or dissimilarity can alter their properties. To make matters worse,
some properties that would look natural or fundamental, like symmetry or transitivity, are
still under discussion (see e.g. [6], [7], [8]). In summary, the lack of a basic agreed-upon theory
sometimes leads to incompatible definitions or results focused on a specific kind of similarities
or dissimilarities.

The present work intends to make a further effort in the unification of both concepts (see, for 
example, [9]), in two basic ways. First, with a basic but fully operational definition of similarity 
and dissimilarity and a set of fundamental properties and transformations. And second, with a 
study of how these transformations change the properties of the similarities and dissimilarities. 

2 Preliminaries 

Let X be a non-empty set where an equality relation is defined. In a general sense, similarity 
and dissimilarity express the degree of coincidence or divergence between two elements of a 
reference set. Therefore, it is reasonable to treat them as functions since the objective is to 
measure or calculate this value between any two elements of the set. 

Definition 1. A similarity measure is an upper bounded, exhaustive and total function
$s : X \times X \to I_s \subset \mathbb{R}$ with $|I_s| > 1$ (therefore $I_s$ is upper bounded and $\sup I_s$ exists).

Definition 2. A dissimilarity measure is a lower bounded, exhaustive and total function
$d : X \times X \to I_d \subset \mathbb{R}$ with $|I_d| > 1$ (therefore $I_d$ is lower bounded and $\inf I_d$ exists).

Define now $s_{\max} = \sup I_s$ and $d_{\min} = \inf I_d$. Without loss of generality, we can take
$s_{\max} \ge 0$ and $d_{\min} \ge 0$. In any other case, a non-negative maximum or minimum can be
obtained by applying a simple transformation (e.g. $s + |s_{\max}|$). The following are useful properties
for these functions to fulfill. For conciseness, we introduce them for both kinds of functions at
the same time.

Reflexivity: $s(x,x) = s_{\max}$ (implying $\sup I_s \in I_s$) and $d(x,x) = d_{\min}$ (implying $\inf I_d \in I_d$).
Strong Reflexivity: $s(x,y) = s_{\max} \iff x = y$ and $d(x,y) = d_{\min} \iff x = y$.
Symmetry: $s(x,y) = s(y,x)$ and $d(x,y) = d(y,x)$.

Boundedness: A similarity $s$ is lower bounded when $\exists a \in \mathbb{R}$ such that $s(x,y) \ge a$, for
all $x,y \in X$ (this is equivalent to asking that $\inf I_s$ exists). Conversely, a dissimilarity $d$ is upper
bounded when $\exists a \in \mathbb{R}$ such that $d(x,y) \le a$, for all $x,y \in X$ (this is equivalent to asking that
$\sup I_d$ exists). Given that $|I_s| > 1$ and $|I_d| > 1$, both $\inf I_s \neq \sup I_s$ and $\inf I_d \neq \sup I_d$ hold
true.

Closedness: Given a lower bounded function $s$, define now $s_{\min} = \inf I_s$. The property
asks for the existence of $x,y \in X$ such that $s(x,y) = s_{\min}$ (equivalent to asking that $\inf I_s \in I_s$).
Given an upper bounded function $d$, define $d_{\max} = \sup I_d$. The property asks for the existence
of $x,y \in X$ such that $d(x,y) = d_{\max}$ (equivalent to asking that $\sup I_d \in I_d$).

Complementarity: Consider now a function $C : X \to 2^X$. A lower closed similarity $s$
defined in $X$ has complement function $C(x) = \{x' \in X \mid s(x,x') = s_{\min}\}$, if $\forall x,x' \in X$, $|C(x)| =
|C(x')| \neq 0$. An upper closed dissimilarity $d$ defined in $X$ has complement function $C$, where
$C(x) = \{x' \in X \mid d(x,x') = d_{\max}\}$, if $\forall x,x' \in X$, $|C(x)| = |C(x')| \neq 0$. In case $s$ or $d$ are
reflexive, necessarily $x \notin C(x)$. Each of the elements in $C(x)$ will be called a complement of $x$.
Moreover, $s$ or $d$ have unitary complement when $\forall x \in X$, $|C(x)| = 1$. In this case, $\forall x \in X$:

For similarities: $\exists y' \mid s(x,y') = s_{\max} \iff \exists y' \mid y' \in C(y), \forall y \in C(x)$
For dissimilarities: $\exists y' \mid d(x,y') = d_{\min} \iff \exists y' \mid y' \in C(y), \forall y \in C(x)$
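
The basic properties above can be checked numerically on a finite sample of the reference set. The following is a minimal sketch, not part of the original paper: the helper names (`is_reflexive`, `is_strongly_reflexive`, `is_symmetric`) and the sample dissimilarity are illustrative assumptions, and a check on a finite sample can only refute a property, never prove it on an infinite set.

```python
from itertools import product

def is_reflexive(f, points, extreme, tol=1e-12):
    """f(x, x) equals the extreme value (s_max or d_min) for every sampled x."""
    return all(abs(f(x, x) - extreme) <= tol for x in points)

def is_strongly_reflexive(f, points, extreme, tol=1e-12):
    """f(x, y) equals the extreme value exactly when x == y."""
    return all((abs(f(x, y) - extreme) <= tol) == (x == y)
               for x, y in product(points, repeat=2))

def is_symmetric(f, points, tol=1e-12):
    """f(x, y) == f(y, x) for every sampled pair."""
    return all(abs(f(x, y) - f(y, x)) <= tol
               for x, y in product(points, repeat=2))

# Illustrative dissimilarity on a finite sample of [0, 1], with d_min = 0.
def d(x, y):
    return abs(x - y)

points = [i / 10 for i in range(11)]
print(is_reflexive(d, points, 0.0),
      is_strongly_reflexive(d, points, 0.0),
      is_symmetric(d, points))
```

For a similarity, the same helpers apply with `extreme` set to its maximum value.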

Let us define a transitivity operator in order to introduce the transitivity property in simi- 
larity and dissimilarity functions. 

Definition 3. (Transitivity operator). Let $I$ be a non-empty subset of $\mathbb{R}$, and let $e$ be a fixed
element of $I$. A transitivity operator is a function $\tau : I \times I \to I$ satisfying, for all $x,y,z \in I$:

1. $\tau(x, e) = x$ (null element)

2. $y \le z \Rightarrow \tau(x,y) \le \tau(x,z)$ (non-decreasing monotonicity)

3. $\tau(x,y) = \tau(y,x)$ (symmetry)

4. $\tau(x,\tau(y,z)) = \tau(\tau(x,y),z)$ (associativity)

There are two groups of transitivity operators: those for similarity functions, for which $e =
\sup I = s_{\max}$ (and then $I$ is $I_s$) and those for dissimilarity functions, for which $e = \inf I = d_{\min}$
($I$ is $I_d$). It should be noted that this definition reduces to uninorms [10] when $I = [0, 1]$.
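
The four axioms of a transitivity operator can be tested on a finite grid of values. The sketch below is an illustration under stated assumptions: the function name `is_transitivity_operator`, the grid and the two candidate operators (bounded sum with null element 0, Łukasiewicz operator with null element 1) are choices made here, and passing the grid check is a necessary condition only.

```python
from itertools import product

def is_transitivity_operator(tau, grid, e, tol=1e-9):
    """Check the four axioms of a transitivity operator on a finite grid."""
    null = all(abs(tau(x, e) - x) <= tol for x in grid)
    monotone = all(tau(x, y) <= tau(x, z) + tol
                   for x, y, z in product(grid, repeat=3) if y <= z)
    symmetric = all(abs(tau(x, y) - tau(y, x)) <= tol
                    for x, y in product(grid, repeat=2))
    associative = all(abs(tau(x, tau(y, z)) - tau(tau(x, y), z)) <= tol
                      for x, y, z in product(grid, repeat=3))
    return null and monotone and symmetric and associative

grid = [i / 20 for i in range(21)]  # sample of I = [0, 1]

def bounded_sum(a, b):
    """Candidate operator for dissimilarities: null element e = inf I = 0."""
    return min(1.0, a + b)

def lukasiewicz(a, b):
    """Candidate operator for similarities: null element e = sup I = 1."""
    return max(0.0, a + b - 1.0)

print(is_transitivity_operator(bounded_sum, grid, 0.0))
print(is_transitivity_operator(lukasiewicz, grid, 1.0))
```

An operator such as the arithmetic mean fails the null element axiom for any choice of `e`, so the check rejects it.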

Transitivity: A similarity $s$ defined in $X$ is called $\tau_s$-transitive if there is a transitivity
operator $\tau_s$ such that the following inequality holds:

$$s(x,y) \ge \tau_s(s(x,z), s(z,y)) \quad \forall x,y,z \in X$$

A dissimilarity $d$ defined in $X$ is called $\tau_d$-transitive if there is a transitivity operator $\tau_d$ such
that the following inequality holds:

$$d(x,y) \le \tau_d(d(x,z), d(z,y)) \quad \forall x,y,z \in X$$
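
As an illustration of the two inequalities, the following sketch tests transitivity of a measure with respect to a given operator over all triples of a finite sample. The function name `is_tau_transitive`, its `kind` argument and the sample data are assumptions introduced here for the example.

```python
from itertools import product

def is_tau_transitive(measure, tau, points, kind, tol=1e-9):
    """Test s(x,y) >= tau(s(x,z), s(z,y)) or d(x,y) <= tau(d(x,z), d(z,y)).

    kind is "similarity" or "dissimilarity" and selects the direction
    of the inequality.
    """
    for x, y, z in product(points, repeat=3):
        value = measure(x, y)
        bound = tau(measure(x, z), measure(z, y))
        if kind == "similarity" and value < bound - tol:
            return False
        if kind == "dissimilarity" and value > bound + tol:
            return False
    return True

points = [i / 10 for i in range(11)]

def d(x, y):
    return abs(x - y)

# The sum operator yields the triangle inequality; max would yield an ultrametric.
print(is_tau_transitive(d, lambda a, b: a + b, points, "dissimilarity"))
print(is_tau_transitive(d, max, points, "dissimilarity"))
```

The absolute difference passes with the sum operator but fails with `max`, since, for example, the points 0, 0.5 and 1 violate the stronger inequality.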

A similarity or dissimilarity in $X$ may be required simply to satisfy strong reflexivity and
symmetry. It is not difficult to show that strong reflexivity alone implies a basic form of
transitivity [11]. We call $\Sigma(X)$ the set of all similarity functions and $\Delta(X)$ the set of all
dissimilarity functions defined over elements of $X$.

3 Equivalence 

In this section we tackle the problem of obtaining equivalent similarities or dissimilarities, and
of transforming a similarity function into a dissimilarity function or vice versa, which will naturally
lead to the concept of duality.

3.1 Equivalence functions

Consider the set of all ordered pairs of elements of $X$ and denote it $X \times X$. Every $s \in \Sigma(X)$
induces a preorder relation in $X \times X$. This preorder is defined as "to belong to a class of
equivalence with less or equal similarity value". Formally, given $X$ and $s \in \Sigma(X)$, we consider
the preorder $\preceq$ given by

$$(x, y) \preceq (x', y') \iff s(x, y) \le s(x', y'), \quad \forall (x, y), (x', y') \in X \times X$$

Analogously, every $d \in \Delta(X)$ induces the preorder "to belong to a class of equivalence with
less or equal dissimilarity value". Note that $(x, y) \preceq (w, z)$ and $(w, z) \preceq (x, y)$ does not imply
$x = w$ and $y = z$.

Definition 4. (Equivalence). Two similarities (or two dissimilarities) defined in the same 
reference set X are equivalent if they induce the same preorder. 

Note that the equivalence between similarities or between dissimilarities is an equivalence 
relation. The properties of similarities and dissimilarities are kept under equivalence, including 
transitivity. The exception is the boundedness property which will depend on the chosen 
equivalence function. Only the monotonically increasing and invertible functions keep the 
induced preorder. 

Definition 5. (Equivalence function). Let $s$ be a similarity and $d$ a dissimilarity. An equivalence
function is a monotonically increasing and invertible function $f$ such that $f \circ s$ is a
similarity equivalent to $s$. Analogously, $f \circ d$ is a dissimilarity equivalent to $d$.
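
To illustrate what inducing the same preorder means in practice, the sketch below compares how two measures order every two pairs of a finite sample. The helper `same_preorder` and the three example measures are assumptions made for this illustration.

```python
import math
from itertools import product

def same_preorder(m1, m2, points):
    """True if m1 and m2 order every two pairs of sampled elements identically."""
    pairs = list(product(points, repeat=2))
    return all((m1(*p) <= m1(*q)) == (m2(*p) <= m2(*q))
               for p in pairs for q in pairs)

points = list(range(-3, 4))

def d1(x, y):
    return abs(x - y)

def d2(x, y):
    # f(z) = e^z - 1 is monotonically increasing and invertible on [0, inf).
    return math.expm1(d1(x, y))

def d3(x, y):
    # (z - 2)^2 is not monotone in z, so the preorder is not preserved.
    return (d1(x, y) - 2) ** 2

print(same_preorder(d1, d2, points))
print(same_preorder(d1, d3, points))
```

Composing with an increasing invertible function keeps the ranking of pairs, whereas the non-monotone composition reorders them.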

Theorem 1. Let $s_1$ be a transitive similarity and $d_1$ a transitive dissimilarity. Denote by $\tau_{s_1}$
and $\tau_{d_1}$ their respective transitivity operators. Let $f$ be an equivalence function. Then:

1. The equivalent similarity $s_2 = f \circ s_1$ is $\tau_{s_2}$-transitive, where
$$\tau_{s_2}(a,b) = f(\tau_{s_1}(f^{-1}(a), f^{-1}(b))) \quad \forall a,b \in I_{s_2}$$

2. The equivalent dissimilarity $d_2 = f \circ d_1$ is $\tau_{d_2}$-transitive, where
$$\tau_{d_2}(a,b) = f(\tau_{d_1}(f^{-1}(a), f^{-1}(b))) \quad \forall a,b \in I_{d_2}$$

Proof. Consider only the similarity case, in which $f : I_{s_1} \to I_{s_2}$. Using the transitivity
of $s_1$ we know that, for all $x,y,z \in X$, $s_1(x,y) \ge \tau_{s_1}(s_1(x,z), s_1(z,y))$.
Applying $f$ to this inequality, which is preserved because $f$ is increasing, we get

$$(f \circ s_1)(x,y) \ge (f \circ \tau_{s_1})(s_1(x,z), s_1(z,y)).$$

Using $f^{-1} \circ s_2 = s_1$, we get

$$s_2(x,y) \ge (f \circ \tau_{s_1})\big((f^{-1} \circ s_2)(x,z), (f^{-1} \circ s_2)(z,y)\big).$$

Defining $\tau_{s_2}$ as in the Theorem we get the required transitivity expression
$s_2(x,y) \ge \tau_{s_2}(s_2(x,z), s_2(z,y))$.

Therefore, any composition of an equivalence function and a similarity (or dissimilarity) 
function is another similarity (or dissimilarity) function, which is also equivalent. 
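
The construction in Theorem 1 can be sketched as a higher-order function that builds the new operator from the old one. This is an illustrative example only: the name `transport_operator`, the choice f(z) = z / (1 + z) and the sample points are assumptions, not taken from the paper.

```python
from itertools import product

def transport_operator(tau, f, f_inv):
    """Operator of f(measure), given the operator tau of the measure (Theorem 1)."""
    return lambda a, b: f(tau(f_inv(a), f_inv(b)))

# Illustrative equivalence function: increasing bijection from [0, inf) onto [0, 1).
def f(z):
    return z / (1.0 + z)

def f_inv(a):
    return a / (1.0 - a)

def d1(x, y):
    return abs(x - y)

def tau1(a, b):
    return a + b            # d1 satisfies the triangle inequality

def d2(x, y):
    return f(d1(x, y))      # equivalent dissimilarity

tau2 = transport_operator(tau1, f, f_inv)

points = [i / 4 for i in range(-8, 9)]
ok = all(d2(x, y) <= tau2(d2(x, z), d2(z, y)) + 1e-9
         for x, y, z in product(points, repeat=3))
print(ok)
```

For this particular choice of f, the transported operator simplifies algebraically to (a + b - 2ab) / (1 - ab).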

3.2 Transformation functions

Equivalence functions allow us to get new similarities from other similarities or new dissimilarities
from other dissimilarities, but not to switch between the former and the latter. Denote
by $\Sigma^*(X)$ the set of similarities defined in $X$ with codomain on $[0,1]$ and by $\Delta^*(X)$ the set of
such dissimilarities. As we shall see, using appropriate equivalence functions $f^*$, we have a way
to get equivalent similarities (resp. dissimilarities) in $\Sigma^*(X)$ (resp. $\Delta^*(X)$) using similarities
or dissimilarities in $\Sigma(X)$ (resp. $\Delta(X)$) and vice versa. In consequence, defining properties in
$\Sigma(X)$ or $\Delta(X)$ is tantamount to defining them in $\Sigma^*(X)$ or $\Delta^*(X)$, respectively.

Definition 6. A $[0,1]$-transformation function $\hat{n}$ is a decreasing bijection on $[0,1]$ (implying
that $\hat{n}(0) = 1$, $\hat{n}(1) = 0$, continuity and the existence of an inverse). A transformation function
$\hat{n}$ is involutive if $\hat{n}^{-1} = \hat{n}$.
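
As a small illustration of Definition 6, the sketch below checks two decreasing bijections on [0,1] for involutivity on a grid. Both example functions and the helper names are assumptions chosen for the example: 1 - z is involutive, while 1 - z^2 is a decreasing bijection whose inverse, sqrt(1 - z), is a different function.

```python
def is_decreasing(n, grid):
    """Strictly decreasing on consecutive grid points."""
    return all(n(a) > n(b) for a, b in zip(grid, grid[1:]))

def is_involutive(n, grid, tol=1e-9):
    """n(n(z)) == z on every grid point."""
    return all(abs(n(n(z)) - z) <= tol for z in grid)

grid = [i / 100 for i in range(101)]

def standard(z):
    return 1.0 - z

def quadratic(z):
    return 1.0 - z * z

for name, n in (("1 - z", standard), ("1 - z^2", quadratic)):
    print(name, n(0.0), n(1.0), is_decreasing(n, grid), is_involutive(n, grid))
```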

This definition is restricted to similarities (resp. dissimilarities) in $\Sigma^*(X)$ (resp. $\Delta^*(X)$). Since
both $f^*$ and $\hat{n}$ are bijections, a general transformation function between elements of $\Sigma(X)$ (resp.
$\Delta(X)$) is the composition of two or more functions in the following way:

Definition 7. A transformation function $f$ is the composition of two equivalence functions and
a $[0,1]$-transformation function:
$$f = f^{*} \circ \hat{n} \circ \hat{f}^{*},$$

where $\hat{n}$ is a transformation function on $[0,1]$, $f^{*}$ obtains equivalent similarities (resp.
dissimilarities) in $\Sigma(X)$ (resp. $\Delta(X)$) and $\hat{f}^{*}$ obtains equivalent similarities (resp. dissimilarities)
in $\Sigma^*(X)$ (resp. $\Delta^*(X)$).

4 Duality 

As shown throughout this work, similarity and dissimilarity are two interrelated concepts.
In fuzzy theory, t-norms and t-conorms are dual with respect to the fuzzy complement [12].
In the same sense, all similarity and dissimilarity functions are dual with respect to some
transformation function.

Definition 8. (Duality). Consider $s \in \Sigma(X)$, $d \in \Delta(X)$ and a transformation function $f :
I_s \to I_d$. We say that $s$ and $d$ are dual by $f$ if $d = f \circ s$ or, equivalently, if $s = f^{-1} \circ d$. This
relationship is written as a triple $\prec s, d, f \succ$.

Theorem 2. Given a dual triple $\prec s, d, f \succ$,

1. $d$ is strongly reflexive if and only if $s$ is strongly reflexive.

2. $d$ is closed if and only if $s$ is closed.

3. $d$ has (unitary) complement if and only if $s$ has (unitary) complement.

4. $d$ is $\tau_d$-transitive only if $s$ is $\tau_s$-transitive, where
$$\tau_d(x,y) = f(\tau_s(f^{-1}(x), f^{-1}(y))) \quad \forall x,y \in I_d$$

Proof. Take $s \in \Sigma(X)$ and make $d = f \circ s$.

1. For all $x,y \in X$ such that $x \neq y$, we have $s(x,y) \neq s_{\max}$; hence, applying $f$,
we obtain $d(x,y) \neq d_{\min}$.

2. Symmetry is immediate.

3. For all $x,y \in X$, we have $s(x,y) \ge s_{\min}$. Suppose $s$ is closed. Since $f$ is strictly
monotonic and decreasing, $s(x,y) \ge s_{\min}$ implies $(f \circ s)(x,y) \le f(s_{\min})$. Then $s$
is closed because there exist $x,y \in X$ such that $s(x,y) = s_{\min}$, only true if
$(f \circ s)(x,y) = f(s_{\min})$ (i.e. if $d$ is closed).

4. For all $x, x' \in X$ such that $x' \in C(x)$, we have $s(x,x') = s_{\min}$; applying $f$, we
have $(f \circ s)(x,x') = f(s_{\min})$; that is, $d(x,x') = d_{\max}$. Therefore, complementarity
is kept.

5. For transitivity, see [12], Theorem 3.20, page 84.



Thanks to this explicit duality relation, properties of similarities are immediately translated
to dissimilarities, and vice versa. A general view of all the functions and sets introduced so far is
represented in Fig. 4.1.

Figure 4.1: Graphical representation of equivalence and transformation functions from
and within $\Sigma(X)$ and $\Delta(X)$.



5 Application examples 

In this section we develop some simple application examples for the sake of illustration. 
Example 1. Consider the dissimilarities in $\Delta([0,1])$ given by

$$d_1(x,y) = |x - y|, \quad d_2(x,y) = \min(x,y).$$

Their respective transitivity operators are $\tau_{d_1}(a,b) = \min\{1, a + b\}$ and $\tau_{d_2}(a,b) = \min(a,b)$.
Consider the family of transformation functions $f(z) = (1 - z)^{1/\alpha}$, with $\alpha \neq 0$. The
corresponding dual similarities are:

$$s_1(x,y) = (1 - |x - y|)^{1/\alpha}, \quad s_2(x,y) = \max\big((1 - x)^{1/\alpha}, (1 - y)^{1/\alpha}\big).$$

Using Theorem 2, the corresponding transitivity operators are $\tau_{s_1}(a,b) = \max(a^{\alpha} + b^{\alpha} -
1, 0)^{1/\alpha}$ and $\tau_{s_2}(a,b) = \max(a,b)$. Therefore, two dual triples are formed: $\prec s_1, d_1, f \succ$ and
$\prec s_2, d_2, f \succ$. Note that $\tau_{s_1}$ corresponds to a well-known family of t-norms, whereas $\tau_{s_2}$ is the
max norm. When $\alpha = 1$, the transitivity of $s_1$ is the Łukasiewicz t-norm.
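
The first dual triple of Example 1 can be checked numerically. The sketch below is an illustration under assumptions: the exponent parameter is fixed to 2 (any positive value could be tried), all variable names are chosen here, and the tolerance is deliberately loose because taking a root amplifies rounding error near zero.

```python
from itertools import product

alpha = 2.0   # illustrative value of the family parameter
TOL = 1e-6    # loose tolerance: the root amplifies rounding error near 0

def f(z):
    return (1.0 - z) ** (1.0 / alpha)

def f_inv(a):
    return 1.0 - a ** alpha

def d1(x, y):
    return abs(x - y)

def tau_d1(a, b):
    return min(1.0, a + b)

def s1(x, y):
    return f(d1(x, y))                       # dual similarity

def tau_s1_dual(a, b):
    return f(tau_d1(f_inv(a), f_inv(b)))     # operator obtained via Theorem 2

def tau_s1_closed(a, b):
    return max(a ** alpha + b ** alpha - 1.0, 0.0) ** (1.0 / alpha)

grid = [i / 10 for i in range(11)]
agree = all(abs(tau_s1_dual(a, b) - tau_s1_closed(a, b)) <= TOL
            for a, b in product(grid, repeat=2))
transitive = all(s1(x, y) >= tau_s1_closed(s1(x, z), s1(z, y)) - TOL
                 for x, y, z in product(grid, repeat=3))
print(agree, transitive)
```

Both checks pass on the grid: the closed form stated in the text agrees with the operator derived from the duality formula, and the dual similarity is transitive with respect to it.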



Example 2. Consider the similarity defined in $\Sigma(\mathbb{Z})$ given by $s(x,y) = 1 - \frac{|x-y|}{|x-y|+1}$. In this
case the set $I_s$ is a set of rational numbers in $(0,1]$, $\sup I_s = 1$ and $\inf I_s = 0$. This
function satisfies strong reflexivity and symmetry. Moreover, it is lower bounded (with $s_{\min} =
0$), although it is not lower closed. For this reason, it does not have a complement function.

What transitivity do we have here? We know that $|x - y|$ is a metric. Consider now the
transformations $\hat{n}_k(z) = z/(z + k)$, for $k > 0$. Since $\hat{n}_k$ is subadditive, $\hat{n}_k(|x - y|)$ is also a
metric dissimilarity. Therefore, it satisfies the triangle inequality

$$\hat{n}_k(|x - y|) \le \hat{n}_k(|x - z|) + \hat{n}_k(|z - y|).$$

If we apply now the transformation $\hat{n}(z) = 1 - z$, we obtain the original expression for
the similarity $s$. Using Theorem 2, the transitivity finally changes to $s(x,y) \ge \max\{s(x,z) +
s(z,y) - 1, 0\}$.
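
The final inequality of Example 2 can be verified exactly with rational arithmetic. The sketch assumes that the similarity has the form 1 - z/(z + k) applied to z = |x - y|, as the text suggests; the parameter values, the integer range and the function names are illustrative choices.

```python
from fractions import Fraction
from itertools import product

def n_k(z, k):
    """Subadditive transformation z / (z + k) for an integer z >= 0 and k > 0."""
    return Fraction(z, z + k)

def s(x, y, k=1):
    """Assumed form of the similarity: 1 - n_k(|x - y|)."""
    return 1 - n_k(abs(x - y), k)

integers = range(-6, 7)
transitive = all(
    s(x, y, k) >= max(s(x, z, k) + s(z, y, k) - 1, 0)
    for k in (1, 2, 5)
    for x, y, z in product(integers, repeat=3)
)
print(transitive)
```

Using `Fraction` avoids any floating-point tolerance, so the comparison is exact on the sampled integers.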

Example 3. Consider the function $d(x,y) = e^{|x-y|} - 1$. This is a strongly reflexive and symmetric
dissimilarity in $\Delta(\mathbb{R})$ with codomain $I_d = [0, +\infty)$. Therefore, it is an unbounded dissimilarity
with $d_{\min} = 0$. This measure can be expressed as the composition of $f(z) = e^z - 1$ and $d'(x,y) =
|x - y|$. Thus, it is $\tau_d$-transitive with $\tau_d(a,b) = ab + a + b$. Consequently,

$$d(x,y) \le d(x,z) + d(z,y) + d(x,z) \cdot d(z,y), \quad \forall x,y,z \in \mathbb{R}.$$

To see this, use that $d'$ is $\tau_{d'}$-transitive with $\tau_{d'}(a,b) = a + b$ and apply Theorem 1:

$$\tau_d(a,b) = f(\tau_{d'}(f^{-1}(a), f^{-1}(b))) = e^{\ln(1+a) + \ln(1+b)} - 1 = (1+a)(1+b) - 1 = ab + a + b$$

Consider now the equivalence function $f : [0,\infty) \to [0,\infty)$ given by $f(z) = \ln(z + 1)$ and
apply it to the previously defined dissimilarity $d$. The result is the equivalent dissimilarity
$d'(x,y) = |x - y|$, the standard metric in $\mathbb{R}$, transitive with $\tau_{d'}(a,b) = a + b$ (this is the
transitivity leading to the triangle inequality for metrics). The important point is that $d'$
is also $\tau_d$-transitive, since $a + b \le a + b + ab$ when $a,b \in [0,\infty)$. This is due to a gradation
in the restrictiveness of transitivity operators [12]. In this case, $\tau_{d'}$ is more restrictive than $\tau_d$
and therefore, transitivity with the former operator implies transitivity with the latter, but not
conversely.
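
The relations between the two dissimilarities and the two operators discussed in this example can be checked on a sample of real numbers. This is a sketch for illustration: the helper `holds`, the sample points and the relative tolerance are assumptions introduced here.

```python
import math
from itertools import product

def d(x, y):
    return math.expm1(abs(x - y))      # e^{|x - y|} - 1

def d_prime(x, y):
    return abs(x - y)

def tau_d(a, b):
    return a * b + a + b

def tau_d_prime(a, b):
    return a + b

def holds(measure, tau, points, rel=1e-9):
    """measure(x, y) <= tau(measure(x, z), measure(z, y)) on all sampled triples."""
    return all(
        measure(x, y) <= tau(measure(x, z), measure(z, y)) * (1 + rel) + rel
        for x, y, z in product(points, repeat=3)
    )

points = [i / 2 for i in range(-6, 7)]
print(holds(d, tau_d, points))              # d with its own operator
print(holds(d_prime, tau_d_prime, points))  # triangle inequality
print(holds(d_prime, tau_d, points))        # weaker operator also holds
print(holds(d, tau_d_prime, points))        # the converse fails
```

The last check fails, for instance, at x = 0, z = 1, y = 2, where e^2 - 1 exceeds 2(e - 1); this matches the remark that the implication between operators goes in one direction only.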

If we apply now $f'(z) = z^2$ to $d'$ what we get is an equivalent dissimilarity $d''(x,y) = (x - y)^2$,
again strongly reflexive, symmetric and $\tau_{d''}$-transitive, where $\tau_{d''}(a,b) = \sqrt{a^2 + b^2}$. In this case,
$d''$ is more restrictive than both $d'$ and $d$.

Similarity and dissimilarity unify preservation of transitivity using equivalence functions. 
This fact can be used, for example, to get a metric dissimilarity from a non-metric one. In the 
following example we compare the structure of two trees with a non-metric dissimilarity. Upon 
application of an equivalence function we get an equivalent and metric dissimilarity function. 

Figure 5.1: A simple coding of binary trees. The reason for going bottom-up is to have the
least significant digits close to the root of the tree. The choice of making the left nodes more
significant than the right ones is arbitrary. The symbol $\emptyset$ represents the empty tree.



Example 4. Consider a dissimilarity function between two binary trees. It does not measure
differences between nodes but the structure of the tree. Consider a simple tree coding function
$D$ that assigns a unique value to each tree. This value is first coded as a binary number of
length $2^h - 1$, where $h$ is the height of the tree. The tree code is this binary number read as a
natural number. The binary number is computed such that the most significant bit corresponds to the
leftmost and bottommost tree node (Fig. 5.1). Note that $D$ is not a bijection, since there are
numbers that do not code a valid binary tree.

Consider now the following dissimilarity function, where $A$ and $B$ are binary trees. The
symbol $\emptyset$ represents the empty tree with value 0.



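$$
d(A,B) =
\begin{cases}
\max\left(\dfrac{D(A)}{D(B)}, \dfrac{D(B)}{D(A)}\right) & \text{if } A \neq \emptyset \text{ and } B \neq \emptyset \\
1 & \text{if } A = \emptyset \text{ and } B = \emptyset \\
D(A) & \text{if } A \neq \emptyset \text{ and } B = \emptyset \\
D(B) & \text{if } A = \emptyset \text{ and } B \neq \emptyset
\end{cases}
$$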

This is a strongly reflexive, symmetric, unbounded dissimilarity with $I_d = [1, \infty)$ and $d_{\min} =
1$. If we impose a limit $H$ on the height of the trees, then $d$ is also upper bounded and
closed, with $d_{\max} = \sum_{i=0}^{2^H - 1} 2^i$. It is also transitive with the product operator, which is a transitivity
operator valid for dissimilarities defined in $[1, \infty)$; in other words, for any three trees $A, B, C$,
$d(A,B) \le d(A,C) \cdot d(C,B)$.

Proof. If none of $A$, $B$ or $C$ is the empty tree, substituting in the previous
expression and operating with max and the product we get:

$$\max\left(\frac{D(A)}{D(B)}, \frac{D(B)}{D(A)}\right) \le \max\left(\frac{D(A)}{D(B)}, \frac{D(C)^2}{D(A)D(B)}, \frac{D(A)D(B)}{D(C)^2}, \frac{D(B)}{D(A)}\right)$$

which is trivially true. Now, if $A = \emptyset$, then the inequality reduces to $D(B) \le
\max\left(D(B), \frac{D(C)^2}{D(B)}\right)$. The cases $B = \emptyset$ or $C = \emptyset$ can be treated analogously.

If we apply now the equivalence function $f(z) = \log z$ to $d$ we obtain a dissimilarity
$d' = f \circ d$, where the properties of $d$ are kept in $d'$. However, the transitivity operator is changed,
using Theorem 1, to $\tau_{d'}(a,b) = a + b$. In other words, we obtain a metric dissimilarity over
trees fully equivalent to the initial choice of $d$.
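
The passage from the product operator to the sum operator can be illustrated without building trees, by working directly on tree codes. The sketch below makes several assumptions that are not in the paper: trees are represented only by their natural-number codes, code 0 stands for the empty tree, and the sampled codes are not checked for being valid tree codes.

```python
import math
from itertools import product

EMPTY = 0  # assumed code of the empty tree

def d(a, b):
    """Dissimilarity of Example 4, computed on tree codes a and b."""
    if a == EMPTY and b == EMPTY:
        return 1.0
    if b == EMPTY:
        return float(a)
    if a == EMPTY:
        return float(b)
    return max(a / b, b / a)

def d_log(a, b):
    """Equivalent dissimilarity obtained with f(z) = log z."""
    return math.log(d(a, b))

codes = list(range(0, 16))  # illustrative sample of codes
triples = list(product(codes, repeat=3))

# Transitivity with the product operator.
product_transitive = all(d(a, b) <= d(a, c) * d(c, b) * (1 + 1e-12)
                         for a, b, c in triples)
# Transitivity with the sum operator, i.e. the triangle inequality.
triangle = all(d_log(a, b) <= d_log(a, c) + d_log(c, b) + 1e-9
               for a, b, c in triples)
print(product_transitive, triangle)
```

Taking logarithms turns products into sums, which is why the product inequality for `d` becomes the triangle inequality for `d_log`.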

6 Conclusions 

The main goal of this paper has not been to set up a standard definition of similarity and 
dissimilarity, but to establish some operative grounds on the definition of these widely used 
concepts. The data practitioner can take (or leave) the proposed properties as a guide. We 
have studied some fundamental transformations in order to keep these chosen basic properties. 
In particular, we have concentrated on transitivity and its preservation. However, a deeper 
study has to be done about the effects of transformations, especially on transitivity (e.g. which
transformations do keep the triangle inequality) and more complex matters, like aggregation of 
different measures into a global one. Due to the many fields of application these concepts are 
involved with, the study of their properties can lead to better understanding of similarity and 
dissimilarity measures in many areas. 